30 research outputs found

    Intelligent Robotic Perception Systems

    Robotic perception is involved in many robotics applications where sensory data and artificial intelligence/machine learning (AI/ML) techniques come together. Examples of such applications include object detection, environment representation, scene understanding, human/pedestrian detection, activity recognition, semantic place classification, and object modeling, among others. Robotic perception, in the scope of this chapter, encompasses the ML algorithms and techniques that empower robots to learn from sensory data and, based on the learned models, to react and make decisions accordingly. Recent developments in machine learning, namely deep-learning approaches, are evident, and consequently robotic perception systems are evolving in ways that make new applications and tasks a reality. Recent advances in human-robot interaction, complex robotic tasks, intelligent reasoning, and decision-making are, to some extent, the result of the remarkable evolution and success of ML algorithms. This chapter covers recent and emerging topics and use-cases related to intelligent perception systems in robotics.

    ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes

    Compact and accurate representations of 3D shapes are central to many perception and robotics tasks. State-of-the-art learning-based methods can reconstruct single objects but scale poorly to large datasets. We present a novel recursive implicit representation that efficiently and accurately encodes large datasets of complex 3D shapes by recursively traversing an implicit octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space, enabling state-of-the-art reconstruction results at a compression ratio above 99%. We also propose an efficient curriculum learning scheme that naturally exploits the coarse-to-fine properties of the underlying octree spatial representation. We explore the scaling law relating latent space dimension, dataset size, and reconstruction accuracy, showing that increasing the latent space dimension is enough to scale to large shape datasets. Finally, we show that our learned latent space encodes a coarse-to-fine hierarchical structure yielding reusable latents across different levels of detail, and we provide qualitative evidence of generalization to novel shapes outside the training set.
    Comment: Accepted to Conference on Robot Learning (CoRL), 202
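    To make the recursive traversal concrete, below is a minimal PyTorch sketch of the general idea: a parent latent is split into eight octant latents, and an occupancy gate decides where to recurse. The module names, dimensions, and the hard 0.5 threshold are illustrative assumptions, not the architecture from the paper.

```python
# Sketch of a recursive octree latent decoder. All names, sizes, and the
# occupancy-gated stopping rule are illustrative assumptions, not ROAD's
# actual architecture.
import torch
import torch.nn as nn

class RecursiveOctreeDecoder(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Predicts 8 child latents (one per octant) from a parent latent.
        self.split = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 8 * latent_dim),
        )
        # Predicts whether a node is occupied and worth refining.
        self.occupancy = nn.Linear(latent_dim, 1)

    def forward(self, latent, depth, max_depth):
        """Expand a single (unbatched) root latent into leaf latents."""
        if depth == max_depth:
            return [latent]
        children = self.split(latent).chunk(8, dim=-1)
        leaves = []
        for child in children:
            # Only recurse into octants predicted as occupied; empty
            # space is pruned, which is where the compression comes from.
            if torch.sigmoid(self.occupancy(child)) > 0.5:
                leaves.extend(self.forward(child, depth + 1, max_depth))
        return leaves

# Example: leaves = RecursiveOctreeDecoder()(torch.randn(128), 0, 3)
```

    A per-leaf implicit head (e.g. an SDF decoder) would then turn each leaf latent into local geometry, matching the coarse-to-fine structure the abstract describes.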

    Peculiarities of the pedagogical measurement of students' learning achievements

    The article highlights the problems of pedagogical testing, its advantages and disadvantages, and technologies for constructing tests; it analyzes the requirements for test quality and the conditions for the effective introduction of test-based knowledge assessment, presents a classification of test items, and analyzes errors in test content and in the organization of the testing process.

    ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking

    Multi-object tracking is a cornerstone capability of any robotic system. Most approaches follow a tracking-by-detection paradigm. Within this framework, however, detectors operate in a low-precision, high-recall regime, ensuring a low number of false negatives while producing a high rate of false positives. This can burden the tracking component by making data association and track lifecycle management more challenging. Additionally, false-negative detections in difficult scenarios such as occlusions can degrade tracking performance. We therefore propose a method that learns shape and spatio-temporal affinities between consecutive frames to better distinguish between true-positive and false-positive detections and tracks, while compensating for false-negative detections. Our method provides a probabilistic matching of detections that leads to robust data association and track lifecycle management. We quantitatively evaluate our method through ablative experiments and on the nuScenes tracking benchmark, where we achieve state-of-the-art results. Our method not only estimates accurate, high-quality tracks but also decreases the overall number of false-positive and false-negative tracks. Please see our project website for source code and demo videos: sites.google.com/view/shasta-3d-mot/home
    Comment: 8 pages, 3 figures
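    As background, a toy version of the detection-to-track association step that such affinities feed into could look like the sketch below. The hand-crafted center-distance-plus-size cost, the Hungarian assignment, and the gating threshold are generic stand-ins for the learned affinities and probabilistic matching described above, not the paper's method.

```python
# Toy detection-to-track association for 3D boxes [x, y, z, l, w, h].
# Cost mixes spatial distance and shape (size) similarity; ShaSTA instead
# learns these affinities and matches probabilistically.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, dist_weight=1.0, shape_weight=0.5,
              max_cost=5.0):
    cost = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            center_dist = np.linalg.norm(t[:3] - d[:3])
            shape_dist = np.linalg.norm(t[3:6] - d[3:6])
            cost[i, j] = dist_weight * center_dist + shape_weight * shape_dist
    rows, cols = linear_sum_assignment(cost)
    # Gate implausible matches; unmatched tracks/detections then drive
    # track termination and birth in the lifecycle logic.
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]
```

    False positives show up here as detections that never match an existing track, which is why better affinities directly improve both association and lifecycle management.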

    Learning Optical Flow, Depth, and Scene Flow without Real-World Labels

    Self-supervised monocular depth estimation enables robots to learn 3D perception from raw video streams. This scalable approach leverages projective geometry and ego-motion to learn via view synthesis, assuming a mostly static world. Dynamic scenes, which are common in autonomous driving and human-robot interaction, violate this assumption and therefore require modeling dynamic objects explicitly, for instance by estimating pixel-wise 3D motion, i.e., scene flow. However, the simultaneous self-supervised learning of depth and scene flow is ill-posed, as infinitely many combinations yield the same 3D point. In this paper we propose DRAFT, a new method capable of jointly learning depth, optical flow, and scene flow by combining synthetic data with geometric self-supervision. Building upon the RAFT architecture, we learn optical flow as an intermediate task to bootstrap depth and scene flow learning via triangulation. Our algorithm also leverages temporal and geometric consistency losses across tasks to improve multi-task learning. Our DRAFT architecture simultaneously establishes a new state of the art in all three tasks in the self-supervised monocular setting on the standard KITTI benchmark. Project page: https://sites.google.com/tri.global/draft
    Comment: Accepted to RA-L + ICRA 202
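    For background, the "view synthesis" supervision mentioned above is usually the photometric reprojection objective below, where K denotes the camera intrinsics, T_{t->s} the ego-motion from target to source frame, D_t the predicted depth at target pixel p_t, and rho a photometric error such as SSIM plus L1. This is the generic static-world formulation, not DRAFT's full multi-task loss.

```latex
% Standard photometric view-synthesis objective for self-supervised
% monocular depth: warp source frame I_s into the target view via
% predicted depth D_t, ego-motion T_{t->s}, and intrinsics K.
\hat{p}_s \sim K \, T_{t \to s} \, D_t(p_t) \, K^{-1} \, p_t ,
\qquad
\mathcal{L}_{\mathrm{photo}} = \sum_{p_t} \rho\big( I_t(p_t),\, I_s(\hat{p}_s) \big)
```

    Because this objective only constrains static pixels, dynamic pixels admit infinitely many depth and motion explanations; that is the ill-posedness the abstract refers to, and the gap that jointly estimated scene flow fills.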

    Where's Waldo at time t? Using spatio-temporal models for mobile robot search

    We present a novel approach to mobile robot search for non-stationary objects in partially known environments. We formulate the search as a path planning problem in an environment where the probability of object occurrence at a particular location is a function of time. We propose to explicitly model the dynamics of object occurrences by their frequency spectra. Using this spectral model, our path planning algorithm can construct plans that reflect the likelihoods of object locations at the time the search is performed. Three datasets collected over several months, containing person and object occurrences in residential and office environments, were chosen to evaluate the approach. Several types of spatio-temporal models were created for each of these datasets, and the efficiency of the search method was assessed by measuring the time it took to locate a particular object. The results indicate that modeling the dynamics of object occurrences reduces the search time by 25% to 65% compared to maps that neglect these dynamics.
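    A minimal sketch of the spectral idea follows: estimate the dominant periodic components of binary occurrence observations, then predict the occurrence probability at a future query time. The candidate periods, component count, and clipping are illustrative assumptions, not the paper's exact model.

```python
# Sketch of a frequency-spectrum occurrence model: fit the strongest
# periodic components of 0/1 observations, then predict occurrence
# probability at a future time. Parameters are illustrative only.
import numpy as np

def fit_spectral_model(times, occupied, n_components=2):
    """times: observation timestamps (s); occupied: 0/1 observations."""
    mean = occupied.mean()
    residual = occupied - mean
    # Candidate periods: day, week, and their halves, in seconds.
    periods = np.array([86400.0, 604800.0, 43200.0, 302400.0])
    omegas = 2 * np.pi / periods
    components = []
    for _ in range(n_components):
        # Pick the frequency whose complex amplitude best explains
        # the residual, then subtract its contribution.
        amps = [np.mean(residual * np.exp(-1j * w * times)) for w in omegas]
        best = int(np.argmax(np.abs(amps)))
        components.append((omegas[best], amps[best]))
        residual = residual - 2 * np.real(
            amps[best] * np.exp(1j * omegas[best] * times))
    return mean, components

def predict(mean, components, t):
    # Reconstruct the probability as mean plus the periodic components.
    p = mean + sum(2 * np.real(a * np.exp(1j * w * t)) for w, a in components)
    return float(np.clip(p, 0.0, 1.0))
```

    A search planner can then rank candidate locations by their predicted occurrence probability at the planned visit time, which is how the spectral model feeds the path planning step.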